-
Notifications
You must be signed in to change notification settings - Fork 589
Create new OP_MULTIPARAM
to implement subroutine signatures
#23574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
Weirdly, while github CI and most commentors on IRC report test failures in In case it's somehow related to build options, the script I use to configure and build is: test -f config.sh && rm config.sh
test -f Policy.sh && rm Policy.sh
./Configure -des \
-Dusedevel \
-Dprefix=$HOME/perl5/perlbrew/perls/bleadperl/ \
-DDEBUGGING \
-Uversiononly \
-Dman1dir=none -Dman3dir=none \
-Dusethreads \
-Dinc_version_list=none \
-Doptimize='-gdwarf-2 -g3' \
"$@"
make -j8
make -j8 test_prep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message: "This two" should be "These two"
op.c
Outdated
*/ | ||
varop->op_next = defop; | ||
defexpr->op_next = varop; | ||
param->padix = allocmy("", 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perl_allocmy
currently does not support empty names because it blindly uses name[1]
. It needs to be changed to check len
first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocmy()
is just a wrapper around pad_add_name_pvn()
with some extra sanity checking and flag-setting (I think it exists just for a convenience from the parser). I've instead changed this code to call pad_add_name_pvn()
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding more asserts in #23583 which would have caught this
482ac37
to
e961b24
Compare
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: Perl#23574 (comment)
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: #23574 (comment)
The previous names were just copied from similar fields in previous signature handling code. These new names are a better fit for their current and intended future purpose.
By defining a helper term for optional defaulting expression we can remove one of the three alternate cases down to just two. This will help when adding more code to these action blocks in future,
These two new helper functions will be useful for sharing behaviour with upcoming code, when adding OP_MULTIPARAM.
…gument processing The specific behaviour of this obscure modification case is not documented or relied upon elsewhere in code. Rather than attempt to preserve this cornercase, it's easier just to no longer test for it, as upcoming changes will alter the values that are visible.
Creates a new UNOP_AUX op type, `OP_MULTIPARAM`, that handles all of the initial behaviour of assigning values to parameters of a subroutine signature out of values passed by the caller. This is created in a similar style to other multi-ops like `OP_MULTIDEREF` and `OP_MULTICONCAT` where the op's aux structure contains a sub-program of sorts, which describes all of the small details of operation. Also adds a LOGOP, `OP_PARAMTEST` and UNOP `OP_PARAMSTORE` which are responsible for implementing the default expressions of optional parameters. These use the `SvPADSTALE` flag set on pad lexicals used as parameters to remember whether assignment has happened, ensuring that missing vs present-but-undef can be detected in a way that does not depend on counting arguments on the stack. This change is carefully designed to support two future ideas: * Named parameters as per PPC0024 https://github.com/Perl/PPCs/blob/main/ppcs/ppc0024-signature-named-parameters.md * "no-snails"; the performance optimisation that avoids storing incoming argument values in the `@_` AV and instead consumes them directly from the stack
e961b24
to
5dd6e3a
Compare
EXTEND(SP, (IV)(3 + nparams + 1)); | ||
mPUSHu(p->min_args); | ||
mPUSHu(p->n_positional); | ||
PUSHs(sv_2mortal(p->slurpy ? newSVpvf("%c", p->slurpy) : &PL_sv_no)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why run a 1 byte long string through a printf engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because offhand I don't believe we have a newSVpvc
function. Can you suggest an alternative?
{ | ||
struct op_multiparam_aux *p = (struct op_multiparam_aux *)aux; | ||
UV nparams = p->n_positional; | ||
EXTEND(SP, (IV)(3 + nparams + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the MEXTEND mortal stack stretcher macro here too.
for(UV parami = 0; parami < nparams; parami++) | ||
mPUSHu(p->param_padix[parami]); | ||
mPUSHu(p->slurpy_padix); | ||
XSRETURN(3 + nparams + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace with PUTBACK; return;
if(namepv) | ||
sv_catpvf(retsv, ":%s=%" UVf, namepv, paramidx); | ||
else | ||
sv_catpvf(retsv, ":(anon)=%" UVf, paramidx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define PadnamePV(pn) (pn)->xpadn_pv
#define PadnameLEN(pn) (pn)->xpadn_len
#define PadnameUTF8(pn) 1
#define PadnameSV(pn) newSVpvn_flags(PadnamePV(pn), PadnameLEN(pn), SVs_TEMP|SVf_UTF8)
":(anon)=%"
can be de-duped to 1 branch above with ":(%s)=%". "anon",
in theory PadnameLEN(pn)
and sv_catXXX()
family calls should be taken advantage of here, rather than repeatedly going through the printf engine.
IDK and IDC enough, and probably its impossible to write a bug ticket with a failure/defect demo, about it, but I see UTF8 flag is perma-on in the data source API, but we arent propagating it to the higher level. And this is XS::APItest anyways, so perfection isnt critical. but someone might look at this in the future for "best practices" ideas, and then copy paste quicky hacky code into a more visible API.
# Look for OP_NULL[OP_PARAMTEST[OP_PARAMSTORE]] | ||
if ($o->name eq 'null' and $o->flags & OPf_KIDS and | ||
$o->first->name eq 'paramtest' and | ||
$o->first->first->name eq 'paramstore') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$o->first
cache the retval of the method or key ($first = $o->first)->name
.
|
||
length $sig[$parami] > 1 ? | ||
( $sig[$parami] .= ' ' ) : | ||
( $sig[$parami] = '$' ); # intentionally no trailing space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont look up same thing over and over in an array, do a $sref = \$sig[$parami];
or $s = $sig[$parami];
. probably the first is easier in this sub.
|
||
my $defop = $paramtest->first->first; | ||
if ($defop->name eq "stub") { | ||
$sig[$parami] .= "$assign"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
y the ""s?
char slurpy; /* the sigil of the slurpy var (or null) */ | ||
OP *elemops; /* NULL, or an OP_LINESEQ of individual element and fence ops */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move field char slurpy; to the bottom of the struct. don't have hidden alignment holes.
* zero or more arguments. | ||
*/ | ||
UV next_argix; /* the argument index of the next parameter we add */ | ||
UV opt_params; /* number of optional scalar parameters */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using 64 bit integers to store an arrays length on i386 CPUs? Since when is void *
8 bytes long on a Pentium 3?
if(signature->elemops) | ||
op_free(signature->elemops); | ||
if(signature->params) { | ||
for(UV parami = 0; parami < signature->nparams; parami++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i386 doesnt have 8 byte pointers, use Size_t.
/* handle '$=' special case */ | ||
if(varop) | ||
if(padix) | ||
yyerror("Optional parameter lacks default expression"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was struct yy_parser_signature_param *param = subsignature_push_param();
leaked if this error branch was entered?
UV end_argix = signature->next_argix; | ||
|
||
struct op_multiparam_aux *aux = (struct op_multiparam_aux *)PerlMemShared_malloc( | ||
sizeof(struct op_multiparam_aux) + end_argix * sizeof(PADOFFSET)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I know PEMDAS but a () would be nice and make it easier to read, on what gets multiplied.
/* Move the entire chain of kid ops in one go */ | ||
OpMORESIB_set(cLISTOPx(sigops)->op_last, fenceops->op_first); | ||
cLISTOPx(sigops)->op_last = fenceops->op_last; | ||
OpLASTSIB_set(cLISTOPx(sigops)->op_last, sigops); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor out cLISTOPx(sigops)
to a C auto i suspect its many derefs deep and keeps being re-read after each of these assignments in -O1/-O2. cLISTOPx(sigops)->op_last
is written so many times its a puzzle what happened to the old target ptr struct, and the new target ptr, and is new or old target inside the container after this block.
struct yy_parser_signature_param *params = signature->params; | ||
UV max_argix = 0; | ||
|
||
for(UV parami = 0; parami < signature->nparams; parami++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no 64 bit pointers on ARM32/i386.
PADOFFSET padix = param->padix; | ||
|
||
while(max_argix < argix) { | ||
aux->param_padix[max_argix] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't double deref in a loop. CC has no idea if &(aux->param_padix)
and &(aux->param_padix[5])
happened to be the same mem addr.
IDK what realistic values are for max_argix, but if its over 16 x U32/U64 or 32 x U32/U64 call libc's memset aka perl Zero(), below that, this loop will win in speed b/c its move 4/8 aligned bytes at a time, while all libc memsets, need 4-8 branches to figure out alignment and if to use 2/4/8/16 sse/32 avx/64 avx512 loops.
OP *o = newUNOP(OP_NULL, 0, paramtest); | ||
|
||
paramstore->op_next = o; | ||
paramtest->op_next = o; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these zeroings really needed? use a step debugger to find out whats inside. are they already zeroed b/c of their earlier allocator() always zering new structs? is it uninit memory? was there a resource tracked ptr here that we need to free in this slot before zering?
UV n_positional; /* = the number of mandatory + optional scalar parameters, not counting a final slurpy */ | ||
char slurpy; | ||
PADOFFSET *param_padix; /* points at storage allocated along with the struct itself, immediately following */ | ||
PADOFFSET slurpy_padix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort all struct fields by alignment and size, dont use 64 bit ptrs on 32 bit CPUs
STATIC void | ||
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ | ||
for(IV ix = startix; ix < endix; ix++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AV*s use SSize_t not IV.
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ | ||
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless var av is reachable from PP lang, use fetch_simple() variants, we dont need to call TIEARRAY
PP methods here.
{ | ||
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes alot of work for sv_setsv_flags()
to figure out &PL_sv_undef
is empty. This isn't Chrome or NodeJS, its C. It won't constant fold. Grep the perl repo for the correct way to make a new read write undef SV*
.
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, | ||
(SV_DO_COW_SVSETSV|SV_NOSTEAL)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SV_DO_COW_SVSETSV good idea
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, | ||
(SV_DO_COW_SVSETSV|SV_NOSTEAL)); | ||
if(!av_store(av, ix, newsv)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is av a TIEARRAY? Is this really going to return false? Why are we continuing execution and not throwing a panic:/heap corruption/OOM? Can it be a TIEARRAY? why not use av_simpleX() family of calls here?
Renew() Newx() handles all OOMs NULL RETVALS for us unless we explicitly temporarily disabled that, which is done approximately in only 1 place inside the entire P5P repo, and ~1-2 places on grep CPAN.
#define av_refresh_elements_range(av, startix, endix) S_av_refresh_elements_range(aTHX_ av, startix, endix) | ||
STATIC void | ||
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the av_extend()? so we aren't constantly going inside libc's realloc every 1-5 SVs. I forgot the exact increment unit av_store() does on a grow event, but even wanting know that count is UB/bad coding practices. av_extend() it once. We know the max length, we aren't iterating a HV or a SQL DB or a readdir() or something off an AJAX socket here.
|
||
UV parami; | ||
for(parami = 0; parami < nparams; parami++) { | ||
PADOFFSET padix = aux->param_padix[parami]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor out of the loop reading's struct aux's param_padix field over and over , nothing will be reallocing it afaik
} | ||
|
||
SV **valp = av_fetch(defav, parami, FALSE); | ||
SV *val = valp ? *valp : &PL_sv_undef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is valp retval going to be NULL?
SV*
val should be set to NULL instead of &PL_sv_undef
here if SV**
was NULL.
if (UNLIKELY(TAINT_get) && !SvTAINTED(val)) | ||
TAINT_NOT; | ||
|
||
SvSetMagicSV(*padentry, val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go read the source code for macro SvSetMagicSV()
and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go read the source code for macro
SvSetMagicSV()
and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
Also from the comment above, if right side SV is NULL, there is a better way to take an existing SV ptr and set it to undef. grep the repo's root .c and .h files to learn how. Since you want to fire SetMagic here (why and when can there be SMG here?), make sure the correct set to undef macro or fn either fires SMG for you or fire SMG yourself right after it returns.
|
||
if(av_count(av)) { | ||
/* see "target should be empty" comments in pp_argelem above */ | ||
av_refresh_elements_range(defav, parami, parami + argc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should retval of av_count(av) be passed as arg 4 to av_refresh_elements_range()? does av_refresh_elements_range() need that info? dont look it up twice. IDK enough to say what to do here.
if (UNLIKELY(TAINT_get) && !SvTAINTED(val)) | ||
TAINT_NOT; | ||
|
||
av_store(av, avidx++, newSVsv(val)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newSVsv is not cow aware, see your code from above how to make it cow aware. dont use 64b ptrs on i386/arm32. newSVsv(&PL_sv_undef) is not how to alloc a new SV with value undef. use av_simple variants unless TIEARRAY can happen here.
argc -= 2; | ||
|
||
if (UNLIKELY(SvGMAGICAL(key))) | ||
key = sv_mortalcopy(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define sv_mortalcopy(sv) Perl_sv_mortalcopy_flags(aTHX_ sv, SV_GMAGIC|SV_DO_COW_SVSETSV)
looks good on this line
SV **svp; | ||
|
||
svp = av_fetch(defav, parami, FALSE); parami++; | ||
SV *key = svp ? *svp : &PL_sv_undef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What on earth does hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
mean?
how about executing hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
5 or 9 times in a row, each time with a different addr in var "val"? what does that mean? why is the previous sentence valid correct behavior and not a panic/heap corruption?
|
||
if(!ok) | ||
return cLOGOP->op_other; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic tree makes no sense. test each c auto or malloc backed variable exactly once in this if/else tree. var
OP*
PL_op should be saved to a C auto at the very top of this PP func to stop multiple rereads.
PL_op is an lvalue but it is not a variable. If you believe its a C variable goto the top of pp.c and add this line. it will be harmless.
#include "EXTERN.h"
#define PERL_IN_PP_C
#include "perl.h"
#include "keywords.h"
#include "invlist_inline.h"
#include "reentr.h"
#include "regcharclass.h"
#undef PL_op /* <<<<< Add this line and push this commit to blead, its a C variable right? */
/* variations on pp_null */
PP(pp_stub)
Creates a new UNOP_AUX op type,
OP_MULTIPARAM
, that handles all of the initial behaviour of assigning values to parameters of a subroutine signature out of values passed by the caller. This is created in a similar style to other multi-ops likeOP_MULTIDEREF
andOP_MULTICONCAT
where the op's aux structure contains a sub-program of sorts, which describes all of the small details of operation.Also adds a LOGOP,
OP_PARAMTEST
and UNOPOP_PARAMSTORE
which are responsible for implementing the default expressions of optional parameters. These use theSvPADSTALE
flag set on pad lexicals used as parameters to remember whether assignment has happened, ensuring that missing vs present-but-undef can be detected in a way that does not depend on counting arguments on the stack.This change is carefully designed to support two future ideas:
Named parameters as per PPC0024
"no-snails"; the performance optimisation that avoids storing incoming argument values in the
@_
AV and instead consumes them directly from the stack--
This set of changes currently does not have a perldelta entry, because it doesn't directly make any changes that are end-user visible. However, since it is quite a significant internal change, I could be talked into writing an entry in the "internal changes" section anyway. Reviewers: Thoughts?